Towards improving ASR robustness for PSN and GSM telephone applications

Authors

  • Chafic Mokbel
  • Laurent Mauuary
  • Lamia Karray
  • Denis Jouvet
  • Jean Monné
  • Jacques Simonin
  • Katarina Bartkova
Abstract

In real-life applications, errors in the speech recognition system are mainly due to inefficient detection of speech segments, unreliable rejection of Out-Of-Vocabulary (OOV) words, and insufficient account of noise and transmission channel effects. In this paper, we review a set of techniques developed at CNET in order to increase the robustness to mismatches between training and testing conditions. These techniques are divided in two classes: preprocessing techniques and Hidden Markov Models (HMM) parameters adaptation. The results of several experiments carried out on field databases, as well as on databases collected over PSN and GSM networks are presented. The main sources of errors are analyzed. We show that a blind equalization scheme significantly improves the recognition accuracy regarding both field and GSM data. Speech detection allows a system to delimit the boundaries of the words to be recognized. We also use preprocessing techniques to increase the robustness of such detectors to noisy GSM speech. We show that spectral subtraction improves speech detection under noisy GSM conditions. Bayesian adaptation of HMM parameters produces models which are robust to field and GSM conditions. Models robust to GSM conditions can also be generated by linear regression adaptation of HMM parameters. Our experiments show an equivalent performance obtained with both Bayesian and linear regression adaptation of HMM parameters. The results obtained also prove that HMM adaptation and preprocessing techniques can be advantageously combined to improve Automatic Speech Recognition (ASR) robustness. © 1997 Elsevier Science B.V.
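As an illustration of why blind channel compensation can help under training/testing mismatch, the sketch below applies cepstral mean subtraction, a common and very simple form of blind equalization. It is a stand-in chosen for brevity and is not necessarily the blind equalization scheme evaluated in the paper. It assumes that a fixed transmission channel shows up as a constant additive offset on each cepstral coefficient; the function name, the toy frames, and the bias values are all hypothetical.

```python
from typing import List

Frame = List[float]  # one cepstral vector per analysis frame


def cepstral_mean_subtraction(frames: List[Frame]) -> List[Frame]:
    """Subtract the utterance-level mean of each cepstral coefficient.

    Assumption: a stationary channel adds a constant offset to every
    frame in the cepstral domain, so removing the mean removes it.
    """
    if not frames:
        return []
    n = len(frames)
    dim = len(frames[0])
    # Per-coefficient mean over the whole utterance.
    mean = [sum(f[k] for f in frames) / n for k in range(dim)]
    return [[f[k] - mean[k] for k in range(dim)] for f in frames]


# Hypothetical toy data: three 2-dimensional "cepstral" frames.
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]

# Hypothetical channel effect, modeled as a constant cepstral bias.
channel_bias = [0.5, -1.5]
distorted = [[c + b for c, b in zip(f, channel_bias)] for f in clean]

normalized_clean = cepstral_mean_subtraction(clean)
normalized_distorted = cepstral_mean_subtraction(distorted)
```

Under the constant-bias assumption, `normalized_clean` and `normalized_distorted` coincide, so a recognizer fed normalized features sees the same input regardless of the channel. Real channels and additive noise violate this assumption to some degree, which is one reason to combine preprocessing with model adaptation as the abstract suggests.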

Similar resources

Research and Development of Robust Speech Recognition

This paper describes recent research and development activities on robust ASR (automatic speech recognition) in NTT Human Interface Laboratories. ASR system design has been changing from the experimental to the commercial level. A relevant issue in achieving practical ASR is robustness against environmental noise and speaker/circuit differences. Adaptation techniques have been widely investigat...

Signal bias removal using the multi-path stochastic equalization technique

We propose using Hidden Markov Models (HMMs) associated with the cepstrum coefficients as a speech signal model in order to perform equalization or noise removal. The MUlti-path Stochastic Equalization (MUSE) framework allows one to process data at the frame level: it is an on-line adaptation of the model. More precisely, we apply this technique to perform bias removal in the cepstral domain in...

Audio-Visual Automatic Speech Recognition: An Overview

We have made significant progress in automatic speech recognition (ASR) for well-defined applications like dictation and medium vocabulary transaction processing tasks in relatively controlled environments. However, ASR performance has yet to reach the level required for speech to become a truly pervasive user interface. Indeed, even in “clean” acoustic environments, and for a variety of tasks,...

Automatic recognition of child speech for robotic applications in noisy environments

Automatic speech recognition (ASR) allows a natural and intuitive interface for robotic educational applications for children. However, there are a number of challenges to overcome to allow such an interface to operate robustly in realistic settings, including the intrinsic difficulties of recognising child speech and high levels of background noise often present in classrooms. As part of the EU...

Avoiding distortions due to speech coding and transmission errors in GSM ASR tasks

In this paper, we have extended our previous research on a new approach to ASR in the GSM environment. Instead of recognizing from the decoded speech signal, our system works from the digital speech representation used by the GSM encoder. We have compared the performance of a conventional system and the one we propose on a speaker-independent, isolated-digit ASR task. For the half and full-rate ...

Journal title:
  • Speech Communication

Volume 23  Issue 

Pages  -

Publication date: 1997